Abstract. We review theoretical approaches, experiments and numerical simulations that have been
recently proposed to investigate the folding problem in single-domain proteins. From a theoretical 
point of view, we emphasize the energy landscape approach. As far as experiments are concerned, 
we focus on the recent development of single-molecule techniques. In particular, we compare the 
results obtained with two main techniques: single protein force measurements with optical tweezers 
and single-molecule fluorescence in studies on the same protein (RNase H). This allows us to point 
out some controversial issues such as the nature of the denatured and intermediate states and possible 
folding pathways. After reviewing the various numerical simulation techniques, we show that
on-lattice protein-like models can help to understand many controversial issues.

1. INTRODUCTION

Electrostatic forces, van der Waals interactions, hydrogen bonds and entropic forces are
the main elementary interactions that govern the thermodynamics and kinetics of molecular
interactions. In solution, electrostatic forces are mostly screened except at very short
distances, typically on the order of an Angstrom. At these distances, chemical bonds with
energies a few hundred times larger than the thermal energy $k_B T$ tend to form at
physiological temperatures ($k_B$ is the Boltzmann constant and $T$ the temperature of the
solvent). At physiological temperatures, the free energy of the hydrogen bonds involved in
the formation of the protein secondary structures (α-helix, β-sheet) is around $2 k_B T$
[Fin02]. Also, the van der Waals potentials that are responsible for the protein tertiary
interactions are on the order of $k_B T$. Thus, in proteins (but also in nucleic acids), the
thermal energy is comparable to the free energy of formation of non-covalent interactions.
This leads to opposite effects. On the one hand, it means that thermal agitation is the main
source of intrinsic noise for biological processes [Mca99]. On the other hand, it suggests
that thermal energy may be used as an energy source to trigger conformational changes and
therefore induce mechanical work at the molecular level [Rit03]. However, in order to carry
out specific tasks in a highly fluctuating environment, evolution, through natural selection,
has favoured the formation of compact biological structures (DNA, RNA, protein) that are
stabilized by multiple non-covalent bonds. RNAs and proteins are small enough to be activated
by the small amount of energy available from ATP hydrolysis and, at the same time, stable
enough to be biologically functional. DNA is a very long charged polymer, but only a small
number of base pairs are involved during transcription or replication processes. Furthermore,
proteins such as DNA polymerases or helicases act locally on the DNA.
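To make the energy scale used above concrete, the following minimal sketch evaluates $k_B T$ in units common in single-molecule work. The temperature of 310 K and the function name `thermal_energy` are illustrative assumptions, not values taken from the text.

```python
# Thermal energy scale k_B*T at an assumed physiological temperature.
K_B = 1.380649e-23      # Boltzmann constant in J/K (exact SI value)
N_A = 6.02214076e23     # Avogadro constant in 1/mol (exact SI value)
J_PER_KCAL = 4184.0     # joules per kilocalorie


def thermal_energy(temperature_kelvin):
    """Return k_B*T expressed in joules, pN*nm and kcal/mol."""
    joules = K_B * temperature_kelvin
    return {
        "J": joules,
        "pN_nm": joules * 1e21,                      # 1 pN*nm = 1e-21 J
        "kcal_per_mol": joules * N_A / J_PER_KCAL,   # molar energy
    }


# 310 K (about 37 degrees Celsius) is an assumption made for illustration.
kT = thermal_energy(310.0)
print(kT)
```

With these numbers, $k_B T$ is a few pN·nm, which is why piconewton forces acting over nanometre distances are the natural scale of the force experiments discussed in section 3.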

Proteins are ubiquitous molecules with a large variety of functions (regulatory, enzymatic,
structural, ...) [Pet04]. Regulatory proteins are involved in gene regulation processes,
structural proteins (microtubules, actin filaments, ...) give mechanical rigidity to the
cell, transmembrane proteins regulate ion and water transport through membranes, etc.
Proteins do not usually work alone. In some cases, several individual proteins participate
in a common task, such as the helicases and DNA polymerases that coordinate their action
during replication. In other cases, proteins are subunits of large molecular complexes such
as the ribosome, which consists of a patchwork of RNA and protein subunits.

During cell activity, proteins are continuously synthesized, and destroyed by proteases.
The constituent amino acids are carried by transfer RNAs, and the ribosomes synthesize
polypeptide sequences by reading the genetic code of the messenger RNA. Proteins have the
remarkable ability to fold into a native structure. This propensity was demonstrated in
vitro by Anfinsen et al. [Anf73] in a denaturation/renaturation experiment on the
Ribonuclease A protein in the presence of urea. Subsequently, it has become clear that this
is a general property of proteins, since many experiments on different proteins have led to
the same conclusion [Dob98]. The fast folding property is crucial since a protein becomes
active only by adopting a specific thermodynamically stable structure (and in many cases by
further forming specific complexes with small ligands). Furthermore, the diversity in
protein functions is related to the diversity of protein structures. Folding of large
proteins is helped by specialized biological machines, the so-called chaperones
(GroEL-GroES, DnaK-DnaJ, ...).

The fast folding property is not trivial. A random sequence of amino acids (and in extreme
cases a single amino acid mutation of a good folder) leads to a polypeptide chain that
behaves as a random coil without any specific structure [Dav94, Cre92]. To understand how
proteins fold, different theoretical pictures have been proposed during the past twenty
years [Fin02, Dil97, Onu04, Thi05]. Interestingly, recent single-molecule experiments
[Bus03, Ben96, Bas97], which make it possible to investigate biochemical processes at the
molecular level [Bus03], in conjunction with increasingly powerful simulations, have refuted
some of the theories and sharpened the big picture. Such a symbiosis between experiments,
theory and numerical simulations has led to a better understanding of how biological
machines work [Bus03]. For instance, one can now think of models that predict the native
structure of a protein from its primary sequence [Bra03]. At a different level, biophysicists
are able to observe in real time the action of single proteins acting on DNA, such as gyrase,
a protein that relaxes DNA supercoils [Gor06], and study it under different conditions
(temperature, pH, tension, torsion, ...).

This article is a short review about the folding/unfolding of small proteins. We first 
discuss the theoretical ideas that are nowadays used to tackle this problem. Next, we deal 
with the most recent experimental techniques that have provided important information 
about the folding mechanism of different proteins. We finish by reviewing the numerical 
techniques that are commonly used to investigate structural properties of the proteins 
during the folding transition. Our main goal is to present the basic notions necessary 
to understand the physics of the folding problem. The reader interested in a deeper 
understanding will find more detailed discussions in the references given in each section.

In section 2, we discuss the energy landscape picture, a useful scheme that provides an
intuitive idea of the folding propensity [Dil97, Onu97] but that has also led to
quantitative tools useful to predict native and intermediate states [Onu04]. We further
illustrate this approach by describing two simple models that show protein-like behaviour.
In section 3, we describe the main single-molecule techniques used to investigate individual
proteins. We focus our discussion on the differences between single-molecule force and
single-molecule fluorescence experiments. To this end, we compare studies that have been
carried out on the protein RNase H. In section 4, we review different numerical techniques
such as molecular dynamics. We explore the use of coarse-grained models and give more
details about generic lattice models that share protein-like properties. These models have
the advantage of being computationally inexpensive, and make it possible to tackle general
properties expected in single-domain protein folding. In particular, we address the question
of the force-induced dynamics of a single-domain protein.



2. THE FREE ENERGY LANDSCAPE PICTURE 

2.1. The Levinthal paradox 

The structure of the native state of a protein is hierarchical. At the lowest level of
description, a protein is a sequence of amino acids (residues) linked by peptide bonds.
There are twenty different types of amino acids, corresponding to different side chain
groups (see Fig. 1). The residue sequence is called the primary structure. The formation of
nearby hydrogen bonds between the amide and the carbonyl groups (Fig. 1) stabilizes the
secondary structures, which mainly consist of α-helices and β-sheets. The secondary
structures are further stabilized by the tertiary interactions, which are either hydrophobic
interactions or disulfide bonds. Hydrophobicity results from the preferential exposure of
hydrophilic side chains to the solvent, leading to the condensation of non-polar residues
inside the core of the protein.

FIGURE 1. A protein is a chain of amino acids linked by peptide bonds (the peptide units are
outlined by the parallelograms). The side-chain (outlined in green) defines the residue
(amino acid + side-chain). Twenty types of side-chain exist, the simplest being a single
hydrogen atom, in which case the residue is glycine. In this case, there is no β-carbon. The
peptide units are planar, due to the sp² hybridization type of the N-C bond. Different χ
angles correspond to different conformations of the side-chain. Two conformations (cis and
trans) are possible for the peptide unit, depending on the positions of the oxygen and the
hydrogen at the tops of the parallelogram. Carbon, oxygen, nitrogen and hydrogen atoms are
respectively represented in grey, red, blue and white. The backbone structure is highlighted
in black.

FIGURE 2. Artistic cartoon of a perfect funnelled landscape. The vertical axis stands for
the energy and the horizontal plane for the degrees of freedom of the polypeptide chain.
Taken from [Dil97].

Each individual peptide group can have two conformations (Fig. 1). For an $M$-residue chain,
one then roughly expects $2^M$ possible configurations. Assuming that the minimal timescale
for a stereoisomeric conformational change is about one picosecond, the total time required
to visit all the configurations should be $\sim 2^M$ ps $\sim 10^{10}$ years for a
100-residue protein [Fin02]. This crude approximation shows that the folding process cannot
consist of a random search in the protein configurational space [Lev68].

On the contrary, the energy landscape, i.e. the energy surface as a function of the
configurational parameters (the degrees of freedom), is biased toward the native structure,
as depicted in Fig. 2. Within the ideal picture of Fig. 2, at sufficiently "low
temperatures" (when $k_B T$ is on the order of the formation energy of a native contact
[Zwa92]), the free energy landscape is biased by the energy gradient, leading to downhill
motion and collapse towards the native structure [Dil97, Onu04, Onu97]. This situation
corresponds to a perfect funnelled landscape [Onu04]. The underlying mechanism that leads to
such a smooth landscape is referred to as "minimal frustration" [Onu97]. A minimally
frustrated structure is a structure for which the intra-molecular interactions are not in
conflict with each other, leading to a smooth landscape as in Fig. 2. The concept of a
funnel is not only qualitative but also quantitative. The simplest way to design a perfect
funnel is by considering interactions that only stabilize the native structure [Onu04]. By
following this strategy and using a coarse-grained description of proteins (e.g. the Go
model, see section 4.1), excellent predictions of native structures and even intermediate
states have been obtained [Onu04]. However, the perfect funnelled landscape is not so
general. A rough energy landscape with many local minima and saddles, corresponding to
configurations with various degrees of stability, is more appropriate. Within this picture,
misfolded behaviour results from the competition between local minima that are close to the
native state [Onu97]. For a detailed discussion of the energy landscape, see the review by
Onuchic et al. [Onu97].
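The Levinthal estimate given at the start of this section can be reproduced in a few lines. This is only an arithmetic sketch: the function name `levinthal_time_years` and the use of a Julian year are illustrative choices, while the two states per residue and the one-picosecond step come from the text.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, an illustrative convention


def levinthal_time_years(n_residues, states_per_residue=2, step_seconds=1e-12):
    """Time needed to visit every configuration once, at one step per configuration."""
    n_configurations = states_per_residue ** n_residues
    return n_configurations * step_seconds / SECONDS_PER_YEAR


years = levinthal_time_years(100)
print(f"{years:.1e} years")  # a few times 10^10 years for 100 residues
```

The result is longer than the age of the universe, which is the content of the paradox: an exhaustive random search cannot be the folding mechanism.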
FIGURE 3. Dependence of the entropy on the energy in a random energy protein. In the pure
random energy model (without the native state at energy $E_N$), the mean value of the energy
$E^*$ and the entropy $S^*$ are given by the coordinates at which the derivative of $S(E)$
is equal to $1/T$. In the composite model (i.e. with the native state), the free energy
equality between the native state and the denatured states implies
$1/T_N = S^*/(E^* - E_N)$. $T_N$ is the transition temperature at which the native state and
the denatured state are equally likely. $\Delta$ is the energy gap.



2.1.1. Mixing stochasticity and determinism

The energy landscape picture, by definition, leads to non-specific folding pathways from
the many denatured (i.e. non-native) conformations to the folded native state. This scheme
has long been opposed to the very first scenarios that aimed at explaining the folding
property. According to the latter, folding is a specific mechanism whose dynamics is
sequential, which leads to a unique folding pathway [Fin02], in contrast with the stochastic
nature of the energy landscape approach [Dil97, Onu97, Fin02]. These two viewpoints are not
contradictory but rather describe mechanisms at different levels. For instance, Lazaridis
and Karplus [Laz97] have studied 24 unfolding trajectories of a small protein (chymotrypsin
inhibitor 2) using molecular dynamics. They observed large statistical fluctuations in the
radius of gyration of the successive structures during the unfolding process, in agreement
with the energy landscape picture. On the other hand, some specific events, such as the
destruction of tertiary contacts, were found to be specifically ordered in time [Laz97].



2.2. Thermodynamics 

In most cases, small globular proteins fold following an all-or-none process, just as small
RNA hairpins do. The origin of this cooperative effect lies in the fact that the native
state has a very low entropy. Thus, the transition from the denatured state (with high
entropy) to the native state is generally accompanied by an entropy jump or, equivalently,
a peak in the specific heat, as observed in bulk denaturation experiments [Fin02]. In the
following, we discuss two simple models describing the transition between a high entropy
phase and a very low entropy native phase.
FIGURE 4. The Zwanzig picture. $n_s$ is the number of non-native contacts. Left: the
potential $E(n_s)$ is given by $E(n_s) = \epsilon n_s$ when there are non-native contacts
($n_s > 0$). At the native state, $E(0) = -\epsilon_0$. Therefore, the energy gap between
the native state and the lowest denatured states is equal to $\epsilon_0$. Right: the
entropy as a function of $n_s/M$ for large $M$. In this limit,
$S(x = n_s/M)/M = (x - 1)\log(1 - x) - x\log x$.



2.2.1. The entropy crisis avoided

In the glass phenomenology, it has been hypothesized that there is a finite temperature at
which the configurational entropy ([total entropy] - [vibrational entropy]) vanishes
[Edi96]. This has been called "the entropy crisis" by Kauzmann [Kau48]. The simplest model
describing the entropy crisis is the random energy model (REM) [Der80]. In this model, the
entropy is a quadratic function of the energy that vanishes at an energy $E_0$, i.e. there
is no state with an energy below $E_0$. At equilibrium, the free energy corresponds to the
point in the entropy curve $S(E)$ at which the slope of the tangent is equal to $1/T$ (see
Fig. 3). As a consequence, as $T \to T_0$, where $T_0$ corresponds to the energy $E_0$, the
entropy continuously vanishes. The point $E = E_0$ defines the glass transition. By
incorporating into this model a native state with an energy $E_N = E_0 - \Delta$ ($\Delta$
being the so-called energy gap), one gets a first-order transition [Onu97] between a high
entropy state and the native state, at a temperature $T_N = (E^* - E_N)/S^*$. At $T_N$, the
free energy of the denatured state, $F^* = E^* - T_N S^*$, is equal to the native one,
$F_N = E_N$. It can be shown that the glass transition is avoided if the energy gap is much
larger than $E_0/M$ [Bry89, Onu04, Fin02], $M$ being the number of residues. The transition
then becomes first order.
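The tangent construction above can be checked numerically. The sketch below assumes one specific quadratic entropy, $S(E) = M\ln 2 - E^2/(M J^2)$ with $k_B = 1$ (the standard REM parametrization, which is an assumption here since the text does not fix the coefficients); the function names and the energy scale `J` are illustrative.

```python
import math

LN2 = math.log(2.0)


def glass_temperature(J=1.0):
    """Temperature T_0 at which the assumed REM entropy vanishes (k_B = 1)."""
    return J / (2.0 * math.sqrt(LN2))


def native_energy(n_res, gap, J=1.0):
    """E_N = E_0 - gap, with E_0 = -n_res * J * sqrt(ln 2) the lowest REM energy."""
    return -n_res * J * math.sqrt(LN2) - gap


def denatured_state(T, n_res, J=1.0):
    """Energy E* and entropy S* where the slope of S(E) equals 1/T (for T >= T_0)."""
    e_star = -n_res * J**2 / (2.0 * T)
    s_star = n_res * (LN2 - J**2 / (4.0 * T**2))
    return e_star, s_star


def folding_temperature(n_res, gap, J=1.0):
    """Solve F*(T_N) = E_N, i.e. ln2*T^2 - (|E_N|/n_res)*T + J^2/4 = 0.

    The larger root is kept; the smaller one lies below T_0 and is unphysical.
    """
    e = -native_energy(n_res, gap, J) / n_res
    disc = max(e * e - LN2 * J**2, 0.0)  # guard against rounding when gap = 0
    return (e + math.sqrt(disc)) / (2.0 * LN2)


t_n = folding_temperature(100, 30.0)
e_star, s_star = denatured_state(t_n, 100)
print(t_n, (e_star - native_energy(100, 30.0)) / s_star)  # the two values coincide
```

In this parametrization a vanishing gap gives $T_N = T_0$, and any positive gap pushes the folding temperature above the glass temperature, which is the sense in which the entropy crisis is avoided.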



2.2.2. Funnel-driven transition 

The energy funnel is akin to the minimal frustration property. The simplest model
exhibiting a perfect funnelled landscape is the Zwanzig model [Zwa95]. It can be thought of
as a spin model where the energy of a given spin configuration
$\mathcal{S} = \{s_1 \ldots s_M\}$ (we consider a set of $M$ spins) reads
$E = \epsilon \sum_i |s_i - s_i^N|/2 - \epsilon_0\,\delta(\mathcal{S} - \mathcal{S}^N)$,
where $\mathcal{S}^N = \{s_1^N \ldots s_M^N\}$ is the native configuration. The parameters
$\epsilon$ and $\epsilon_0$ are positive energies related to the gradient of the funnel and
the native gap $\Delta$ respectively (see Fig. 4). The energy of this system can be
explicitly written as a function of the number $n_s$ of spins that differ from the spins in
the native configuration [Zwa95]. We will call $M - n_s$ the number of native contacts. Let
us now consider a single-spin dynamics with Metropolis rules. In this case, at any time
there are only two kinds of elementary moves: a spin-flip can lead to a new native contact
($n_s \to n_s - 1$) or to a new non-native contact ($n_s \to n_s + 1$). Since there is no
interaction between the spins, there is no conflict between the interactions, which means
that the probability to have a native contact at a site $i$ does not depend on the
configuration of the other spins. A set of non-interacting constituents that feel a
time-independent local potential is therefore the simplest example of a minimally frustrated
system, since there is no frustration at all.

FIGURE 5. The energy landscape picture. In general, the energy landscape of a protein is
rugged and funnelled, i.e. with an overall gradient that is oriented toward the unique
native state. The landscape is stratified according to the energy of the configuration, or
according to the percentage of native contacts. Misfolded states are the local minima close
to the native state. Adapted from [Onu97].

One can write a master equation for the probability density of the number of contacts $n_s$
at a given time, where $n_s$ represents a reaction coordinate [Zwa95]. The thermodynamic
potential $E(n_s)$, which reflects the funnel shape, is linear in $n_s$ and has a gap at
$n_s = 0$ (Fig. 4). As occurs in any folding transition, there is a competition between the
entropy, which favours denatured states (non-native contacts), and the potential energy
$E(n_s)$, which biases the system towards the native structure. In the limit of a large
number of spins, one finds a transition temperature, $T_N = \epsilon$, at which the
probability of being in the folded/unfolded state is 1/2 [Zwa95].
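The competition between energy and entropy in the spin version of the model can be illustrated with its exact equilibrium statistics. The sketch below assumes $k_B = 1$, a binomial degeneracy $\binom{M}{n_s}$ for each level $E(n_s) = \epsilon n_s$, and a native state at $-\epsilon_0$; the function names and the parameter values are illustrative, and the midpoint found numerically depends on the values chosen for $M$, $\epsilon$ and $\epsilon_0$.

```python
import math


def folded_probability(T, M, eps, eps0):
    """Equilibrium probability of the native state n_s = 0 (k_B = 1)."""
    # Denatured levels: comb(M, n) configurations of energy eps * n, for n >= 1.
    z_denatured = (1.0 + math.exp(-eps / T)) ** M - 1.0
    z_native = math.exp(eps0 / T)
    return z_native / (z_native + z_denatured)


def midpoint_temperature(M, eps, eps0, lo=0.05, hi=50.0, tol=1e-10):
    """Bisection for the temperature at which the folded probability equals 1/2."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if folded_probability(mid, M, eps, eps0) > 0.5:
            lo = mid   # still mostly folded: the midpoint is at higher T
        else:
            hi = mid
    return 0.5 * (lo + hi)


def entropy_per_spin(x):
    """Large-M entropy per spin, S/M = (x - 1) log(1 - x) - x log x, for 0 < x < 1."""
    return (x - 1.0) * math.log(1.0 - x) - x * math.log(x)


# Illustrative parameters only.
t_mid = midpoint_temperature(M=50, eps=1.0, eps0=10.0)
print(t_mid, folded_probability(t_mid, 50, 1.0, 10.0))
```

The last function is the entropy curve quoted in the caption of Fig. 4; for large $M$ it agrees with $\log\binom{M}{n_s}/M$ evaluated at $x = n_s/M$.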



2.2.3. The potential of mean force 

FIGURE 6. The free energy landscape, or equivalently the potential of mean force. Two
scenarios are usually observed. Left: the transition is first order, with a coexistence
phase between the native state and a denatured state. Right: the transition is continuous.
The free energy projection always shows a single well that drifts toward the native state as
the folding conditions become more appropriate (bottom panels). The * symbols indicate the
denatured states and the native states.

Taking into account the ruggedness of the energy surface, the funnelled shape of the
potential energy landscape is usually represented as in Fig. 5 [Onu97]. However, the free
energy landscape is more suitable to discuss a possible thermodynamic transition. Contrary
to bulk experiments, in which measurements lead to ensemble averaged quantities,
single-molecule experiments make it possible to compute the free energy as a function of
reaction coordinates such as the molecular extension, a quantity that is related to the
number of native contacts. This is usually called the potential of mean force. In the
following, we discuss situations where the free energy is projected along a single reaction
coordinate. Nevertheless, it must be stressed that many aspects of the folding kinetics
cannot be understood without considering more than one reaction coordinate (see e.g.
[Lee00]).
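In practice, a potential of mean force along one coordinate is often estimated from the equilibrium histogram of that coordinate, $F(x) = -k_B T \ln P(x)$ up to an additive constant. The sketch below applies this to synthetic data: the two Gaussian populations standing in for native and denatured extensions, the bin count and the function name are all assumptions made for illustration, not experimental values.

```python
import numpy as np


def potential_of_mean_force(samples, bins=30, kT=1.0):
    """Estimate F(x) = -kT * ln P(x) from samples of a reaction coordinate.

    Empty bins are skipped (their free energy is formally infinite), and the
    profile is shifted so that its minimum is zero.
    """
    counts, edges = np.histogram(samples, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    occupied = counts > 0
    prob = counts[occupied] / counts.sum()
    free_energy = -kT * np.log(prob)
    free_energy -= free_energy.min()
    return centers[occupied], free_energy


# Synthetic two-state trace (arbitrary units): a compact and an extended population.
rng = np.random.default_rng(0)
samples = np.concatenate([
    rng.normal(5.0, 1.0, 6000),    # hypothetical native-like extensions
    rng.normal(15.0, 1.5, 4000),   # hypothetical denatured-like extensions
])
x, F = potential_of_mean_force(samples)
print(x[np.argmin(F)])  # location of the deepest well
```

The resulting profile has two wells separated by a poorly sampled barrier region, which mirrors the first-order scenario; the height of the barrier is the least reliable part of such an estimate because few samples fall there.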

The free energy landscape approach calls for two general situations (Fig. 6) that have been
experimentally observed. On the one hand, some single-domain protein experiments have
revealed a continuous phase transition between denatured states and the native state
[Mun04]. In this case, at any temperature, the free energy landscape is composed of a single
well. The minimum of the well drifts towards the native structure as one lowers the
temperature or decreases the denaturant concentration. On the other hand, most of the
single-molecule (and bulk) experiments involving single-domain proteins have revealed a
first-order transition [Fin02, Bak00]: the free energy profile consists of two wells with
minima corresponding to the denatured and the native states (see Fig. 6). As in any
first-order transition, the system goes from a denatured state to the native state by
passing through a coexistence phase. In bulk experiments, this leads to the presence of
proteins that are denatured and proteins that are in a native state. From the single-protein
point of view, this suggests cooperative switches between the native and the denatured
states, as reported in Fig. 7.



2.3. Kinetics 



The energy landscape represents a useful picture to understand the existence of misfolded
structures and, more generally, the folding kinetics of a single molecule. By considering
the overdamped motion of a particle along a potential of mean force (free energy
projection), one implicitly makes the assumption that all the degrees of freedom orthogonal
to the reaction coordinate locally equilibrate. This may not be true in specific
non-equilibrium conditions, such as low temperatures where proteins show glassy behaviour
[Onu97].




FIGURE 7. Example of cooperative transitions observed in a single-molecule experiment [Tan03]. Left:
the vertical axis represents the FRET efficiency (see section 3.1), which reflects the state of the
biomolecule (here an RNA molecule, the hairpin ribozyme, schematically represented on the right part of
the figure). One can see that the molecule switches between the native state (upper configuration on the right) and the
denatured state (lower configuration). Taken from iHatl . 



2.3.1. Two-states and downhill kinetic scenarios 

Kramers theory makes it possible to derive dynamical properties related to the diffusive
motion of a particle along a one-dimensional landscape [Zwa01]. In particular, the mean
first-passage time between the denatured and the native states can be computed to extract
the effective free energy barriers. A two-states description of the Kramers problem models
the dynamics in terms of activated events across a free energy barrier and represents the
simplest description of a cooperative all-or-none transition. This approach is potentially
useful to understand single-molecule force experiments, e.g. the force unfolding of single
RNA molecules [Rit02]. When the position of the transition state moves along the reaction
coordinate as the external conditions (temperature, denaturant concentration, stretching
force) are changed, a behaviour known as the Hammond behaviour [Ham55], an extended
two-states description with a mobile barrier can be applied [Man06].
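The link between the mean first-passage time and the barrier height can be made explicit for overdamped diffusion in one dimension, where the mean first-passage time from $x_0$ to $b$ with a reflecting wall at $a$ is $\tau = D^{-1}\int_{x_0}^{b} dy\, e^{F(y)/k_B T}\int_{a}^{y} dz\, e^{-F(z)/k_B T}$. The sketch below evaluates this integral for an assumed quartic double well $F(x) = h\,(x^2 - 1)^2$ (energies in units of $k_B T$) and compares it with the Kramers saddle-point estimate; the landscape, the grid and the function names are illustrative choices, not taken from the text.

```python
import numpy as np


def mean_first_passage_time(barrier, diffusion=1.0, x_left=-3.0, n_points=4001):
    """MFPT from the well at x = -1 to x = +1 in F(x) = barrier * (x^2 - 1)^2.

    Energies are in units of k_B*T; a reflecting wall is placed at x_left.
    """
    x = np.linspace(x_left, 1.0, n_points)
    dx = x[1] - x[0]
    free_energy = barrier * (x**2 - 1.0) ** 2
    boltzmann = np.exp(-free_energy)
    # inner[k] = integral of exp(-F) from x_left to x[k] (trapezoid rule)
    inner = np.concatenate(([0.0], np.cumsum(0.5 * (boltzmann[1:] + boltzmann[:-1]) * dx)))
    start = np.searchsorted(x, -1.0)  # first grid index with x >= -1
    outer = np.exp(free_energy[start:]) * inner[start:]
    return np.sum(0.5 * (outer[1:] + outer[:-1]) * dx) / diffusion


def kramers_time(barrier, diffusion=1.0):
    """Saddle-point (high-barrier) estimate for the same double well."""
    curvature_well = 8.0 * barrier   # F'' at x = -1
    curvature_top = 4.0 * barrier    # |F''| at x = 0
    return 2.0 * np.pi * np.exp(barrier) / (diffusion * np.sqrt(curvature_well * curvature_top))


for h in (4.0, 6.0):
    print(h, mean_first_passage_time(h), kramers_time(h))
```

The passage time grows roughly exponentially with the barrier height, which is why measured transition times can be inverted to estimate an effective barrier; when the barrier becomes comparable to $k_B T$, the saddle-point estimate is no longer reliable.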



The existence of free energy barriers that make the transition all-or-none is usually
attributed to the asynchronous compensation between energy gain and entropy loss [Onu97].
However, continuous transitions have also been observed in recent experiments [Gar02]. These
transitions can then be thought of as a limiting case of the two-states model where the free
energy barrier becomes comparable to $k_B T$. It is then more convenient to see the folding
as a downhill process [Mun04, Gar02]. Notice that the ideal funnel picture of Fig. 2
actually suggests a compensation between entropy (given by the radius of the funnel) and
energy (given by the depth of the funnel).

The two-states transition between either the native state and a random coil (with no native
contacts) or the native state and a molten-globule structure (with numerous native contacts)
has long been a well accepted scenario for single-domain proteins [Fin02]. However, this has
been disputed in recent numerical studies on lysozyme (1HEL) [Fit04]. This is a
single-domain protein known to exhibit at room temperature a first-order transition between
the native state and a pure random coil as the concentration of denaturant (guanidine
hydrochloride) is increased [Tan66]. In fact, it has been shown that even in the presence of
many native contacts (more than 90%) the radius of gyration and the end-to-end distance are
well described by the Gaussian random coil model [Fit04]. Therefore, standard bulk
experiments may not provide enough information to distinguish a random coil state from a
native-like state. In the next section, we describe single-molecule techniques that might
resolve this controversy by addressing new interesting questions.

FIGURE 8. The RNase H protein structure. The coloured stars represent the dyes that are
chemically attached to the protein. These dyes are used in the fluorescence techniques,
namely the FRET measurement (see text). $r$, the distance between the dyes, is directly
related to the molecular extension of the protein. Adapted from [Cec05].

3. SINGLE-MOLECULE EXPERIMENTS 

In this section, we review the principal techniques used to investigate proteins in
single-molecule experiments. We focus our discussion on two of them: fluorescence
spectroscopy and force measurements. We compare the results obtained with these techniques
on the RNase H protein and discuss whether force may induce folding pathways different from
those of thermal folding.



3.1. Fluorescence techniques 

Three-dimensional native structures can be determined in solution by nuclear magnetic
resonance (NMR) spectroscopy or, for proteins that form crystals, by X-ray crystallography.
For small globular proteins, the two measurements generally give the same result, showing
that the native structure is a highly compact structure in solution. Such techniques are
inappropriate for studying the structure of the transition state and the denatured states.
Indeed, it is impossible to crystallize fluctuating states, and NMR measurements average out
conformational fluctuations. Nevertheless, some bulk techniques, e.g. small-angle X-ray and
neutron scattering, have provided valuable information about quantities such as the radius
of gyration [Mil02]. In particular, it has been shown that random coils are not the most
general denatured state [Mil02], even at high denaturant concentrations. Recently, these
results have been unambiguously confirmed by using single-molecule fluorescence techniques.



Fluorescence techniques are based on the so-called Förster resonant energy transfer (FRET).
A green fluorescent donor dye and a red fluorescent acceptor are chemically attached to the
end residues of the protein (Fig. 8). The donor is excited by a well-tuned laser and further
relaxes by emitting fluorescent light that can be detected by a spectrophotometer. The
acceptor is chosen such that its absorption spectrum overlaps the emission spectrum of the
donor. As a consequence, a non-radiative energy transfer between the chromophores may
decrease the intensity of the donor while enhancing the emission of the acceptor. The (FRET)
efficiency $E$ of energy transfer between acceptor and donor depends on their distance $r$,
and hence on the protein extension, through the simple relation

$$E = \frac{1}{1 + (r/R_0)^6} \qquad (1)$$

where $R_0$ is a characteristic parameter of the pair of dyes. On the other hand, the
efficiency $E$ can be directly related to the intensities emitted by the dyes:

$$E = \frac{I_A}{I_D + I_A} \qquad (2)$$

where $I_D$ and $I_A$ are the intensities emitted by the donor and the acceptor.
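The two FRET relations can be combined to turn measured intensities into a dye-to-dye distance. The sketch below uses the standard sixth-power Förster distance dependence and the intensity ratio for the efficiency; the function names and the numerical values in the example are illustrative assumptions, and real measurements require corrections (background, detection efficiencies) that are ignored here.

```python
def fret_efficiency(r, r0):
    """Transfer efficiency for a dye separation r and Foerster radius r0 (same units)."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def distance_from_efficiency(efficiency, r0):
    """Invert the distance relation; valid for 0 < efficiency < 1."""
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


def efficiency_from_intensities(i_donor, i_acceptor):
    """Efficiency from donor and acceptor emission intensities (uncorrected)."""
    return i_acceptor / (i_donor + i_acceptor)


# Hypothetical photon counts and Foerster radius (in nm), for illustration only.
E = efficiency_from_intensities(i_donor=300.0, i_acceptor=700.0)
r = distance_from_efficiency(E, r0=5.0)
print(E, r)
```

Because of the sixth power, the efficiency changes steeply only for distances close to $R_0$: compact states give efficiencies near 1 and extended states near 0, which is what makes the technique sensitive to folding transitions.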

As a result, a quantitative spectral detection of the dyes gives information about the
conformation of the protein. One may even think of placing the dyes at different locations
in the protein to get further structural information. In this spirit, this technique has
been recently used to investigate specific conformational changes during biological
processes. For instance, it has been used to follow the different steps of protein synthesis
in the ribosome [Bla04]. This is very important to understand the mechanism responsible for
the highly specific codon/anticodon recognition by the transfer RNAs.



3.1.1. The RNase H protein 

Let us now focus on folding studies on a small single-domain protein, the 155-residue
RNase H protein. The native structure of this protein is well known and is shown in
Fig. 8. Under appropriate folding conditions, several (bulk) studies have shown that the
folding is preceded by a fast collapse to a compact structure presumably stabilized by a
central nucleus [Ras99, Bal99].

Nienhaus and co-workers have used the above fluorescence technique to investigate several
structural properties of the RNase H protein [Kuz05, Kuz06]. To determine the spectral
properties of the dyes, they fix an ensemble of proteins on a glass surface. A FRET
histogram is obtained by "counting" the number of proteins with efficiency $E$ (see Eq. 1).
By varying the concentration of denaturant, e.g. guanidine hydrochloride, they have
monitored the cooperative transition between the native and the denatured states. From these
measurements, it is then possible to extract the corresponding potential of mean force (free
energy landscape) along a reaction coordinate that is related to the compactness of the
protein (see Fig. 9). This coordinate actually characterizes the propensity of the molecule
to let the solvent enter. Notice that these curves could be the curves of any single-domain
protein. Interestingly, the folding free energy changes as one varies the concentration of
denaturant. This raises several questions: to what extent is the denatured state at low
denaturant concentration different from the denatured state at high concentration? Is the
transition between the high-denaturant state and the low-denaturant state of the same type
as the continuous transition discussed above? Does the high-denaturant state have a residual
structure reminiscent of the native state? Fluorescence studies by the Nienhaus group have
shown that even at high denaturant concentration, the denatured state is composed of
non-random structures [Kuz05, Kuz06]. This study, in conjunction with the numerical
simulations of the folding of lysozyme (1HEL) [Fit04], casts serious doubts on the true
nature of the denatured state, an issue that has been an experimental challenge for a long
time. Moreover, it is also a difficult problem from the point of view of numerical
simulations because of the huge number of accessible configurations.

FIGURE 9. Free energy profiles of the RNase H as a function of the denaturant concentration
[GdmCl]. Upper panel: the thermodynamically stable state is the native state. Bottom panel:
the thermodynamically stable state is a denatured state. The abscissa is the so-called
cooperativity parameter and is related to the propensity of the state to let the solvent
enter into the molecule. The $U_i$ ($i = 1, 2, \ldots$) are a set of denatured states
structurally close to the native state, the closest being $U_1$. Taken from [Kuz06].

The RNase H results [Kuz05] suggest the existence at low denaturant concentration
of a well-defined compact structure different from the native state. As discussed above,
this structure was expected from earlier stopped-flow kinetic experiments in which RNase
H often showed the accumulation of compact structures during the dead-time of the
measurement. Interestingly, force experiments applied to the same molecule have also
shown the existence of a well-defined intermediate state that coexists with both the
denatured and the native states [Cec05].



3.2. Force measurements 



Force measurements on a single molecule were first realized on double-stranded
DNA [Bus03]. A fluid flow and a magnet were used to stretch the molecule,
which was attached to micron-sized beads. The measurement of the molecular extension



FIGURE 10. Setup of the force measurement in the single-protein RNase H experiment. DNA linkers
are inserted between the beads and the protein in order to be able to manipulate the protein. The DNA
linkers are chemically attached to the ends of the protein via the insertion of a cysteine side-chain. The
bead at the tip of the pipette is held fixed by air suction and the other bead is trapped in the optical well.



of DNA then revealed unexpected mechanical properties, such as the overstretching
transition [Bus03]. Subsequently, different studies have been carried out [Bus03, Ben96]
in order to investigate the behaviour of DNA under torsional strain (using magnetic
beads that can be rotated by magnets) [Str00, Str96], the DNA and RNA unzipping
process (using optical tweezers) [Boc02, Coc03], the DNA packaging problem [Smi01],
DNA/protein interactions or DNA condensation [Rit06]. As a consequence,
single-molecule force experiments have contributed a lot toward our understanding of
the cell machinery. Single-molecule force measurements (on RNA) [Col05] have also
been used to test non-equilibrium theories in statistical physics and to recover folding
free energies in RNA molecules. In this spirit, biomolecules appear to be ideal systems
to explore the thermodynamic behaviour of small systems and to test non-equilibrium
theories in statistical physics [Bus05].

Recent nano-manipulation of single protein molecules using the atomic force microscope
(AFM) [Fis99] has provided direct evidence for sequential unfolding of
individual domains upon stretching [Fis00, Kel97, Rie97, Sch05]. However, optical
tweezers are more appropriate to study the unfolding/folding dynamics of small single-domain
proteins (and small RNAs). In fact, the folding free energies of such biomolecules are
on the order of $100\,k_B T$ at room temperature. Considering a typical gain in extension
$\Delta x$ of order 10 nm between the native state and a stretched state, a mechanical energy
$f \Delta x \sim 100\,k_B T$ is provided by a stretching force on the order of 10 pN, which is in the
ideal working range of optical tweezers [Lan03]. In contrast, the AFM technique is useful
to investigate forces above a few tens of pN [Fis99] but cannot reach forces of the order of a pN,
mainly because of the high spring constant of the cantilever.

Typical optical tweezers experiments use micron-sized glass chambers filled with
water and two beads. The protein is chemically labelled at its ends and polystyrene beads
are chemically coated to stick to the ends of the labelled molecule. Because proteins
are too small to be manipulated with micron-sized beads, a tether consisting of a double-
FIGURE 11. Typical force-extension curve (FEC) during the unfolding/refolding ramp force protocol
of a single-domain protein. The unfolding of the protein corresponds to the extension jump (in red). The
rest of the curve is well described by a worm-like chain model that models the extension of the linkers 
when the protein is still folded, and the extension of the (linkers + unfolded protein) when the protein is 
unfolded. 



stranded DNA is inserted between the beads and the molecule and acts as a polymer
spacer (see Fig. 10). This prevents Van der Waals forces between the beads and the
protein and allows a direct manipulation of the protein. One bead is then held fixed by
air suction on the tip of a glass micro-pipette, while the other is trapped in the focus of a laser
beam. When the bead deviates from the focus, a restoring force acts upon the bead, the
principle being the same as that by which a dielectric substance inside a capacitor is drawn
inwards by the action of the electric field. To a good approximation, the trap potential
is harmonic. Thus, knowing the trap stiffness, it is possible to apply a mechanical force
(by moving the bead) and to observe in real time the force-extension curves (FEC). In
the FECs, the force acting on the molecule is represented as a function of the end-to-end
distance between the two beads. The cooperative opening of the protein is characterized
by a jump in the extension of the molecule (see Fig. 11). By studying the stochastic
properties of the FECs, one is able to recover the distance from the native state to
the transition state and map the free energy landscape as a function of the molecular
extension. The folding and the unfolding rates can also be determined [Man0?, Sch0?].
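
The two ingredients of a force-extension curve mentioned above, a harmonic trap and a worm-like chain description of the tether, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which worm-like chain expression is used, so the sketch adopts the Marko-Siggia interpolation formula, and the function names and all numerical values (thermal energy, persistence length, contour length, trap stiffness) are hypothetical choices made for the example.

```python
KBT_PN_NM = 4.11  # assumed thermal energy near room temperature, in pN*nm


def wlc_force(extension_nm, contour_nm, persistence_nm, kbt=KBT_PN_NM):
    """Force (pN) holding a worm-like chain at a given extension.

    Uses the Marko-Siggia interpolation formula (an assumption of this sketch):
    F = (kBT / P) * [1 / (4 (1 - x/L)^2) - 1/4 + x/L].
    """
    if not 0.0 <= extension_nm < contour_nm:
        raise ValueError("extension must satisfy 0 <= x < contour length")
    z = extension_nm / contour_nm
    return (kbt / persistence_nm) * (0.25 / (1.0 - z) ** 2 - 0.25 + z)


def trap_force(bead_displacement_nm, stiffness_pn_per_nm):
    """Restoring force (pN) of a harmonic optical trap: F = k * displacement."""
    return stiffness_pn_per_nm * bead_displacement_nm


if __name__ == "__main__":
    # Hypothetical linker: 1000 nm contour length, 50 nm persistence length.
    for x in (100.0, 500.0, 900.0, 950.0):
        print(f"x = {x:6.1f} nm -> F = {wlc_force(x, 1000.0, 50.0):7.3f} pN")
    # Hypothetical trap stiffness of 0.1 pN/nm and a 50 nm bead displacement.
    print("trap force:", trap_force(50.0, 0.1), "pN")
```

In such a picture, the extension jump of Fig. 11 corresponds to moving from the curve of the linkers alone to the curve of the linkers plus the unfolded polypeptide, which has a longer contour length.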



3.2.1. The RNase H protein

In the case of the RNase H protein, Cecconi et al. [Cec05] have shown that mechanical
forces can stabilize an intermediate state. We use the word "stabilize" since the intermediate
state corresponds to a local minimum of the free energy landscape projected along
the end-to-end distance (Fig. 13) that is well separated from the unfolded state and the
native state by all-or-none transitions (Fig. 12). At constant force, three regimes can be
distinguished depending on the value of the force: at high force, the molecule is fully
stretched and no native residual contacts are present; at low force, the protein is in its
FIGURE 12. Extension trace of the RNase H protein at constant force (≈ 5.5 pN). In the first part, we see
successive all-or-none transitions between the intermediate state and the unfolded state (fully stretched).
Then, a transition occurs between the intermediate state and the native state, showing that the intermediate
state is on-pathway. Figure taken from [Cec05].



native compact state; in between, there is an intermediate state in which only part
of the native contacts are formed. Three states, instead of two, coexist: the stretched,
the native and the intermediate compact states (see Fig. 13). A statistical analysis of
the breakage force and measurement of the rip extensions have led to an extrapolated
zero-force intermediate free energy that correlates well with that of the early compact
structure that forms in bulk experiments [Ras99].



3.3. Comparing force and FRET measurements 



A comparison between the folding/unfolding study of RNase H with and without 
force raises interesting questions relative to the structure of proteins: 1) Under which 
conditions do we expect that the early molten-globule state that forms at zero force is the 
intermediate state stabilized by mechanical force? 2) FRET measurements have revealed
a hierarchical structure of RNase H in the denatured state [Kuz06]. Is the stabilization of
the intermediate state related to this observation? 3) More generally, is the stabilization
of an intermediate state a signature of a specific folding mechanism? Such questions can
actually be addressed in numerical simulations of simpler models [Jun06].

The force measurements in RNase H also raise questions about the on/off pathway 
nature of the intermediate states, an issue that we discuss in the next paragraph. 



3.4. Probing the nature of the intermediate states 

Let us consider a system with a free energy landscape showing three well-separated
minima (see Fig. 13). One might wonder whether a diffusive dynamics along this profile
fairly reproduces the observed dynamical behaviour in the single-molecule experiment
(Fig. 12). In such a case, by starting from any state in the intermediate region and preventing
the system from going to the stretched region, the molecule should be able to fold to
FIGURE 13. Free energy (G) in RNase H protein projected along its molecular extension. Three regions,
delimited by the signs, can be defined: the native region where the minimum corresponds
to the native state (N), the intermediate region where the minimum corresponds to the intermediate state 
(I) and the stretched region where the minimum corresponds to the unfolded state (U). 



the native state. Misfolded (off-pathway) states are those that cannot lead to the native
state without unfolding back to the stretched states.

In the RNase H force study, the extension trace of Fig. 12 suggests that folding
indeed takes place via the intermediate states [Cec05]. However, we cannot discard
additional states lying at the same coordinate as the intermediate state that cannot
lead to the native state without unfolding back to the stretched state. We can propose
an experimental protocol to quantify the fraction of off-pathway states with respect to
on-pathway states. Each time the system jumps to the intermediate state, we suddenly
relax the force to a lower value. On-pathway states should quickly lead to the native
state without unfolding back to the stretched state. For off-pathway states, however,
one expects a first unfolding event to the stretched state and, most likely, an extremely large
folding time as compared to the typical on-pathway folding time. We give a numerical
example of such a protocol in the next section.



4. NUMERICAL SIMULATIONS 

From an experimental point of view, it is still a challenge to get atomic structural in- 
formation of intermediate states, transition states (corresponding to the maxima of the 
projected free energy landscape), or denatured states. One could think of a fluorescence 
technique using dyes attached to different residues of the proteins. However, the pres- 
ence of the chromophores inside the molecule is likely to impede the correct folding of 
the molecule or to modify the real structure of the expanded states. So far, the best way 
to characterize non-native states has been to resort to numerical simulations. The latter 
can be divided into three classes: 

1. Molecular dynamics. It takes into account all (or almost all) the atomic details of the
molecule, and the solvent can be treated explicitly or implicitly [Kar02]. The folding
pathways are determined by simulating trajectories in quasi-reversible conditions.
The technique is time limited because only nanosecond-long unfolding trajectories
can be obtained, whereas the folding of real proteins occurs mostly on microsecond
timescales. High temperatures or mechanical forces are then usually used to
accelerate the unfolding trajectories [Kar02]. The main argument is that different
conditions induce different timescales but not different mechanisms. Current
improvements in this field have been achieved thanks to the development of more and
more accurate interatomic potentials in different environments [Cor95].
Interestingly, the original technique has also been adapted to simulate a large set
of short trajectories (starting from random configurations) of a designed small
protein (23 residues) at room temperature [Sno02]. A non-negligible number of
very fast folding trajectories (~ 20 ns) has been observed, whereas in experiments
the mean folding time is on the order of microseconds. This shows, as expected
for two-state cooperative proteins, that the folding step is very short but the whole
folding mechanism is slowed down by the presence of many possible denatured
configurations. As mentioned in section 2.3.1, this comes from an asynchronous
compensation between entropy and energy.

2. Coarse-grained models. Within this scheme, one reduces the all-atom description
to a mesoscopic description by neglecting details of the polypeptide chain
[Bak00, Zho99, Vei97]. The parameters of the mesoscopic description are obtained
by a close comparison with experiments. Different levels of simplification are usually
taken into account. Perfect funnel models including only interactions that stabilize
the native structure have led to excellent predictions of native and transition
states [Bak00, Onu0?]. Less restrictive Go models lead to a more refined statistical
description of folding trajectories [Cle03]. At the end of this section, we
describe details of such simulations and discuss the issue in the presence of mechanical
forces. Notice that modelling of the solvent is also essential to understand
protein folding. Within this scope, a Go-like potential including a (de)solvation potential
can be taken into account to model the expulsion of water molecules from
the protein core [Onu0?].

3. Protein-like models. The purpose of these models is not to study the folding mech- 
anism of some specific proteins but rather to give general insights about the fold- 
ing dynamics. They are used to determine whether the dynamics is related to the 
(hetero-)polymeric properties of proteins, such as the native state geometry or the 
contour length of the chain. They are also useful to investigate the folding behaviour 
in the presence of specific external conditions (temperature, denaturant concentration,
stretching force...). Although there may be qualitative differences between on-lattice
and off-lattice models (see e.g. [Pan98]), most of the studies have been done
with models defined on a lattice, the main reason being the possibility to simulate
large molecules over long times. In the following, we review some of these
models and show how to incorporate mechanical forces. In particular, we show that
these models can be useful to tackle the problem of the on/off pathway states.



4.1. Protein-like behaviour of simple models 



4.1.1. Hydrophobic-Polar models 

Heteropolymers on a lattice with simple hydrophobic-polar interactions between non-adjacent
monomers are the simplest models that show a protein-like behaviour. In the
HP model [Cha89, Cha94], a copolymer chain composed of hydrophobic (H)
and hydrophilic, equivalently polar, (P) monomers is considered on a square lattice.
Only the HH interactions are energetically favourable, the so-called Go interaction.
Specific sequences (HPPH...) then lead to protein-like behaviour and have been used to
exhaustively explore the underlying energy landscape [Cha94]. Interestingly, the folding
mechanism has been shown to be reminiscent of that of small single-domain proteins. Indeed, under
appropriate folding conditions, the extended chain quickly condenses into a structure
rich in HH bonds.
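
The energy function of such an HP chain is simple enough to be written down explicitly. The following Python sketch counts HH contacts between monomers that are nearest neighbours on the square lattice but not adjacent along the chain, and assigns an energy -eps to each of them. The function name, the value of eps and the example conformations are assumptions made for illustration; they are not taken from [Cha89, Cha94].

```python
def hp_energy(sequence, coords, eps=1.0):
    """Energy of an HP chain on the square lattice.

    sequence: string of 'H' and 'P'; coords: list of (x, y) lattice sites,
    one per monomer, in chain order. Each HH pair on neighbouring sites that
    is not bonded along the chain contributes -eps.
    """
    if len(sequence) != len(coords):
        raise ValueError("one lattice site per monomer is required")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            raise ValueError("consecutive monomers must sit on neighbouring sites")

    index_at = {site: i for i, site in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Looking only at the +x and +y neighbours counts each pair once.
        for site in ((x + 1, y), (x, y + 1)):
            j = index_at.get(site)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -eps * contacts


if __name__ == "__main__":
    folded = [(0, 0), (1, 0), (1, 1), (0, 1)]      # chain bent into a square
    extended = [(0, 0), (1, 0), (2, 0), (3, 0)]    # fully extended chain
    print(hp_energy("HPPH", folded), hp_energy("HPPH", extended))
```

Enumerating all self-avoiding conformations of a short chain with such a function is what makes an exhaustive exploration of the energy landscape possible for small sequences.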

Since non-native HH bonds are present, the molecule further needs to break HH
bonds to get closer to the native state. This stage is similar to the exploration of a set of
non-native compact structures that precedes the fast downhill step. Such a process is actually
accompanied by an expansion of the structure in order to allow local conformational
changes of the polymer, a behaviour that has been experimentally observed
[Cha94]. More quantitatively, a recent study of this model [Kac06] has pointed out that
the folding rates may not be correlated with the thermodynamic properties of the molecule,
such as the value of the energy gap and the structure of the native state. It rather suggests
that the folding rates are well correlated with the number of local energy minima, i.e.
the former decrease as the latter increases. These results are in good agreement with
some recent experiments [Sca04] but disagree with other experiments that have shown
a correlation between the native structural properties and the folding rates [Pla98].

Notice that models of this kind do not present an intrinsic hierarchical structure (primary,
secondary and tertiary) as in proteins. They should rather be thought of as a rough
modelling of a mixture of secondary and tertiary contacts. This does not belittle the use
of these studies since it is known that the secondary structures generally form before or
at the same time as the tertiary structure. Indeed, in general terms, it is believed that there
are three kinds of possible folding mechanisms: i) the hierarchical mechanism where
the secondary structures form before tertiary contacts, ii) the nucleation-condensation
mechanism where a set of secondary contacts initiates the growth of the native state and
iii) the hydrophobic collapse mechanism where tertiary hydrophobic contacts initiate the
secondary structures. In all cases, a mix of secondary and tertiary structures precedes the
transition state and the precise folding mechanism may strongly depend on each specific
case.



4.1.2. Designed heteropolymers 

It is numerically possible to design heteropolymers, with non-covalent random interactions,
that show a protein-like behaviour [Sha94, Sha93]. To this end, let us consider
a heteropolymer on a cubic lattice whose sequence is composed of N monomers $m_i$,




[Figure 14 image not reproduced; recoverable axis labels: m, 1/T.]

FIGURE 14. The modified Zwanzig model and the bell-shape curve of the folding time. The original
Zwanzig model (see Fig. 4) can be modified (left picture) in order to take into account a native state shifted
from the bottom of the valley, located at 0. The corresponding folding time as a function of the temperature is
reported on the right figure. In this figure $\epsilon = 0.5$ and $m = 5$.



$i = 1 \ldots N$. The interaction energy $E_{ij}$ of two adjacent and non-covalent monomers, $m_i$ and
$m_j$, is supposed to be a random quenched (i.e. fixed during all the procedure) variable
with zero mean value and variance 1 (this sets the energy unit). From the "residues"
and the matrix $E_{ij}$, the following design procedure leads to a heteropolymer that folds
into a compact structure S. Given a sequence of the monomers defined by their position
along the chain, one permutes two of them (which corresponds to an exchange mutation
in an evolutionary terminology) and accepts the permutation if the total energy of the
compact structure S decreases. At the end of this annealing procedure in the primary
sequence space, one generally gets a heteropolymer that folds quickly [Sha94, Sha93].
Moreover, at sufficiently high temperature, two-state behaviour is often observed. The
dynamics is usually of the "corner and crankshaft" Monte-Carlo type with Metropolis acceptance
rate, known to be ergodic [Hil75]. A hint to understand the propensity to fold is
the presence of an energy gap between the native state and the lowest (in energy) misfolded
states [Sha94]. However, the existence of a gap is not a sufficient condition for the
molecule to fold since a flat energy landscape with a single local minimum energy state
(i.e. a golf course with a single hole) does not lead to a fast folder. As a consequence, it is reasonable
to think that the annealing procedure in the primary sequence space indirectly designs a
funnelled energy landscape and not only a single thermodynamically stable state.
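
The design procedure described above can be sketched as follows in Python. The sketch makes several assumptions that are not in the text: the target structure S is taken to be a 3 x 3 x 3 cube filled by a snake-like path (27 monomers), the couplings are drawn from a Gaussian distribution and indexed by monomer identity, and a permutation is accepted only if it strictly lowers the energy of S. All function names are hypothetical.

```python
import random


def snake_cube(n=3):
    """Compact target structure S: a self-avoiding path filling an n x n x n cube."""
    path = []
    for z in range(n):
        layer = []
        for y in range(n):
            row = [(x, y, z) for x in range(n)]
            if y % 2 == 1:
                row.reverse()
            layer.extend(row)
        if z % 2 == 1:
            layer.reverse()
        path.extend(layer)
    return path


def native_contacts(path):
    """Pairs of chain positions on neighbouring sites that are not bonded."""
    index_at = {site: i for i, site in enumerate(path)}
    contacts = []
    for i, (x, y, z) in enumerate(path):
        for site in ((x + 1, y, z), (x, y + 1, z), (x, y, z + 1)):
            j = index_at.get(site)
            if j is not None and abs(i - j) > 1:
                contacts.append((min(i, j), max(i, j)))
    return contacts


def random_couplings(n, rng):
    """Quenched symmetric couplings with zero mean and unit variance."""
    e = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            e[a][b] = e[b][a] = rng.gauss(0.0, 1.0)
    return e


def structure_energy(seq, contacts, e):
    """Total energy of the target structure for a given ordering of the monomers."""
    return sum(e[seq[i]][seq[j]] for i, j in contacts)


def design(seq, contacts, e, n_trials, rng):
    """Swap two monomers at a time; keep the swap only if the energy of S decreases."""
    seq = list(seq)
    energy = structure_energy(seq, contacts, e)
    for _ in range(n_trials):
        i, j = rng.sample(range(len(seq)), 2)
        seq[i], seq[j] = seq[j], seq[i]
        trial = structure_energy(seq, contacts, e)
        if trial < energy:
            energy = trial
        else:
            seq[i], seq[j] = seq[j], seq[i]  # reject: undo the permutation
    return seq, energy


if __name__ == "__main__":
    rng = random.Random(0)
    contacts = native_contacts(snake_cube(3))
    couplings = random_couplings(27, rng)
    start = list(range(27))
    designed, e_designed = design(start, contacts, couplings, 5000, rng)
    print("initial energy :", structure_energy(start, contacts, couplings))
    print("designed energy:", e_designed)
```

Accepting only energy-lowering permutations is a zero-temperature version of the annealing in sequence space; a finite "design temperature" with a Metropolis rule would be a natural variant, but it is not what the sketch implements.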

In numerical studies, one has access at any time to the total number of native contacts,
the number of native contacts of each monomer, the structural overlap (that quantifies
the matching of the relative position of distant monomers), the end-to-end distance
and the gyration radius. Such an amount of information has led to a good understanding
of such systems. For instance, it has been shown that the folding rates are correlated
with the parameter $\sigma = |T_\theta - T_f|/T_\theta$ [Kli96], $T_\theta$ and $T_f$ being respectively the Flory
coil-to-globule transition and the melting temperatures. The latter determines the first-order
transition between the denatured and the native states. It has also been shown that
the size-dependence of the folding rates is sensitive to the degree of design [Gut96]. Resistance
to mutations has also been studied [Bro99], the main result being that the latter
directly depends on the magnitude of the energy gap. By further adding random interactions
to $E_{ij}$, and by including hydrophobicity, the phase diagram in the temperature and
denaturant concentration (related to the amount of extra disorder) plane has revealed the
presence of a thermodynamic transition line between compact native structures and coil
states but also between native and compact denatured states as suggested by experiments
[Fin02]. The latter are good candidates to be folding intermediates.



The bell-shape of the folding time. The above designed heteropolymers lead to a
folding time that exhibits the bell-shape temperature dependence observed in experiments
(Fig. 14) [Onu97]. The origin of this non-monotonic behaviour can be twofold.
First, it can be due to the roughness of the energy landscape, which becomes the rate-limiting
factor when the thermal energy is on the order of the energy barriers separating the
multiple configurations associated with the denatured state [Bry89, Onu97]. The simplest
corresponding model describing this scenario is due to Zwanzig [Zwa95]. By introducing,
in the microscopic time scale of the original model of Fig. 4, a multiplicative
Arrhenius factor $\exp(\Delta E/k_B T)$, one recovers a non-monotonic behaviour for the folding
time. The argument is as follows. At high temperature (entropic regime), the folding
time is large because of the high entropy of the denatured states (see Fig. 4). At low
temperature, the folding time is large because of the trapping in misfolded states whose
presence is reflected in the modified microscopic timescale. The life-time of the latter
is on the order of $\exp(\Delta E/k_B T)$. Second, it can also be the manifestation of a crossover
between a regime dominated by entropic effects (high temperature) and a regime dominated
by activated events not related to any glass transition [Gut98]. As an example, let
us consider the picture as proposed by Zwanzig [Zwa95]. Instead of taking into account
a native state at the bottom of a potential energy valley, we can define a native state
shifted with respect to the bottom of the valley. If the native state is shifted to a distance
$m$ (Fig. 14), a calculation that assumes partial equilibration out of the native state leads
to a folding time $\tau = f(m)$ [Jun06] that has a bell-shape as reported in Fig. 14. In general,
the folding time is given by $\tau = g(n^*)$ where $n^*$ is the average distance between
the unfolded state and the native state. The minimum folding time then corresponds to
$n^* = m$. At high temperatures, the entropy favours large $n^*$ whereas at low temperature,
$n^* \sim 0$ dominates. In this case, the dynamics is activated and one finds an Arrhenius law
$\tau \sim \exp(2m\epsilon/T)$.

Force-induced transitions. A few numerical investigations of designed heteropolymer
sequences in the presence of force have been done. The study in [Soc99] has revealed
a tricky interplay between different reaction coordinates, e.g. the end-to-end distance
and the number of native contacts. This is a reminder of the possible problems in interpreting
the diffusive dynamics in a projected free energy landscape.

More generally, interesting investigations by Geissler and Shakhnovich [Gei02] have
shown that stretched designed heteropolymers should behave differently from stretched
random heteropolymers. In particular, they argue that only protein-like sequences would
reproducibly unfold and refold at a specific force. Also related to the protein-like behaviour,
it has been shown that a simple stretched polymer at a temperature smaller than
its θ-temperature (the coil-to-globule transition) leads at some force to the formation of
the α-helix secondary structure [Mar03].

Stretched designed heteropolymers on a lattice can illustrate the presence of on- and
off-pathway states. Indeed, in some conditions of temperature and force, a three-state
behaviour can be observed [Jun06] (see Fig. 15). We then carried out the force protocol





FIGURE 15. Three-state behaviour in a simulation of a heteropolymer on a lattice. In this simulation, the
mechanical force $f$ is incorporated by adding a mechanical energy of the type $-\|\vec{f}\| \times \|\vec{r}_{\text{end-to-end}}\|$.
This is different from the usual scalar product in order to prevent the geometrical effects of the lattice
(details in [Jun06]). The upper panel shows the temporal evolution of the end-to-end distance and the
lower panel shows the corresponding evolution of the percentage of native contacts.




FIGURE 16. Left: the distribution of folding times from the intermediate states suggests a case where
only on-pathway states are present. Right: in contrast, off-pathway states are characterized by a peak at
very large time (here 2 x 10^). This peak actually corresponds to a cut-off in the simulation and would 
theoretically correspond to an infinite time. In this example, one finds 48% of on-pathway states and 52% 
of off-pathway states. 



described in section 3.4 to quantify the fraction of misfolded states with respect to
the on-pathway states. To this end, each time the system reaches an extension and a
percentage of native contacts compatible with an intermediate state, the force is set
to zero and the distribution of folding times from this very moment is computed. In
small-sized systems, fluctuations are important and an unfolding event at zero force
is always observed. As a consequence, we numerically constrained the system to stay
in the phase space region corresponding to the intermediate state (details in [Jun06]).
In a situation with only on-pathway states, the distribution is nearly exponential as
reported in Fig. 16 (left). When off-pathway states are present, the distribution consists of
two well-separated contributions (Fig. 16, right). The peak observed at very large times
is located at a numerical cut-off time after which we decided to stop the simulation. This peak
corresponds to the off-pathway states.
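
The bookkeeping behind this protocol is elementary and can be sketched in Python. The first function evaluates the mechanical energy term quoted in the caption of Fig. 15; the other two split a list of recorded folding times into on-pathway events and events that hit the simulation cut-off. The function names, the sample data and the value of the cut-off are hypothetical, and the sketch does not reproduce the constrained lattice dynamics of [Jun06].

```python
import math


def mechanical_energy(force, r_end_to_end):
    """Energy term -|f| * |r_end-to-end| added to the lattice Hamiltonian."""
    f_norm = math.sqrt(sum(c * c for c in force))
    r_norm = math.sqrt(sum(c * c for c in r_end_to_end))
    return -f_norm * r_norm


def pathway_fractions(folding_times, cutoff):
    """Return (on-pathway fraction, off-pathway fraction).

    A run whose folding time reaches the cut-off never folded within the
    simulation and is counted as starting from an off-pathway state.
    """
    if not folding_times:
        raise ValueError("at least one folding time is required")
    n_off = sum(1 for t in folding_times if t >= cutoff)
    f_off = n_off / len(folding_times)
    return 1.0 - f_off, f_off


def mean_on_pathway_time(folding_times, cutoff):
    """Mean folding time of the runs that folded before the cut-off."""
    on_times = [t for t in folding_times if t < cutoff]
    if not on_times:
        raise ValueError("no run folded before the cut-off")
    return sum(on_times) / len(on_times)


if __name__ == "__main__":
    times = [10.0, 50.0, 1000.0, 1000.0]  # made-up folding times, cut-off at 1000
    print(pathway_fractions(times, cutoff=1000.0))
    print(mean_on_pathway_time(times, cutoff=1000.0))
```

For a nearly exponential on-pathway distribution, the mean returned by the last function is a direct estimate of the characteristic on-pathway folding time.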



4.2. Coarse-grained models 



Coarse-grained simulations allow us to investigate folding kinetics up to thousands of
microseconds. This is not possible by using standard molecular dynamics due to limited
computing power. The underlying reason to deal with coarse-grained models is the belief
that microscopic details are not decisive for understanding the folding process [Bak00].
Usual coarse-graining procedures [Bak00, Guo95, Zho99] are inspired by simple Go-like
models such as the HP models. These involve an off-lattice dynamics of only the Cα
carbons of the polypeptide chain. A typical model considers three types of carbons (or
beads in the literature) that can be hydrophobic B, hydrophilic L or neutral N [Vei97].
The energies involved can be divided into two parts: local and non-local. The local
contribution accounts for the covalent bonds and takes into account harmonic bond and
angle potentials, and a dihedral angle potential chosen to favour different orientations
according to the surrounding secondary structure [Vei97, Zho99]. Non-local interactions
account for the non-bonded interactions that are responsible for the folding mechanism.
They are usually described by a Lennard-Jones potential and only interactions between
hydrophobic pairs are taken into account. These models allow one to study the statistical
properties of the folding process starting from unfolded configurations, in contrast to
the molecular dynamics simulation. Several results have been obtained about thermal
folding [Bak00]. For instance, Zhou and Karplus [Zho99] have shown that a wide range
of mechanisms could be observed in small helical proteins just by playing on the energy
difference between the native and the non-native contacts.
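
As an illustration of the non-local part of such an energy function, the Python sketch below sums a Lennard-Jones potential over pairs of hydrophobic (B) beads. It is a minimal example under stated assumptions: reduced units (eps = sigma = 1), a 12-6 form of the potential, and the exclusion of pairs closer than three beads along the chain are choices made here, not parameters taken from [Vei97] or [Zho99], and the function names are hypothetical.

```python
import math


def lennard_jones(r, eps=1.0, sigma=1.0):
    """12-6 Lennard-Jones potential: 4 eps [(sigma/r)^12 - (sigma/r)^6]."""
    if r <= 0.0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def nonlocal_energy(types, positions, eps=1.0, sigma=1.0, min_separation=3):
    """Sum of Lennard-Jones terms over hydrophobic (B-B) pairs only.

    types: sequence of 'B', 'L' or 'N'; positions: sequence of (x, y, z).
    Pairs closer than min_separation along the chain are skipped because
    they are handled by the local (bond, angle, dihedral) terms.
    """
    if len(types) != len(positions):
        raise ValueError("one position per bead is required")
    total = 0.0
    n = len(types)
    for i in range(n):
        if types[i] != "B":
            continue
        for j in range(i + min_separation, n):
            if types[j] == "B":
                total += lennard_jones(math.dist(positions[i], positions[j]), eps, sigma)
    return total


if __name__ == "__main__":
    r_min = 2.0 ** (1.0 / 6.0)  # distance at which the potential is minimal
    beads = ["B", "L", "N", "B"]
    coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, r_min, 0.0)]
    print(nonlocal_energy(beads, coords))
```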

In the spirit of the Go-like models, let us also mention the use of mesoscopic elastic
models which provide insight into protein dynamics and folding/unfolding pathways
[Mic02]. At variance with other approaches, the strength of the non-covalent bonds
depends on the temperature. This has led to the identification of some interesting differences with
random heteropolymers, e.g. the structural regions involved in slow motions for protein-like
models are much more extended than in random heteropolymers [Mic02].

In the presence of force, unfolding pathways seem to be mainly related to the structure 
of the native state [Kli00]. However, since the study of mechanical properties of proteins
is still in its infancy [Ben96, Bus03], it would be rather audacious to say that one can
in any case deduce the mechanical properties from the structure. Two major combined
difficulties actually hamper this investigation: 1) upon mechanical stretching, it
is not clear how the network of forces is distributed inside the protein (for instance,
three-body interactions are numerous in the native state [Ejt04]), and 2) the mechanical
properties at the single-residue level are not known.

The instructive RNA case. A seemingly simpler problem is the one of small RNA 
hairpins. Indeed, in good solvent conditions, the native structure corresponds to the sec- 
ondary structure. The three dimensional structure is "only" constrained by the helix ar- 
rangement of the different base-pairs. Furthermore, the secondary structure is stabilized 
by stacking interactions whose values are well known [Jae93]. 

By adopting a coarse-grained model similar to the one described above, Hyeon and
Thirumalai [Hye05] have studied in detail the differences between force-induced and
thermally induced unfolding. They have also studied the folding transition by using
force-jump and thermal-jump protocols. Their study is extremely valuable since such protocols,
especially the force-jump experiment, have been realized in single-protein experiments
[Fer04, Eee00] and in single RNA hairpin experiments as well [Pan06]. Their RNA
coarse-grained description is composed of three beads that respectively correspond to
the phosphate, the ribose and the base groups. A dihedral potential accounts for the
right-handed chirality of RNA and a stacking stabilization potential is incorporated. Hydrophobic
interactions between bases are described by a Lennard-Jones potential endowed
with a distance cut-off and a Debye-Hückel electrostatic potential is introduced
to describe the interaction between the phosphate groups. They use an overdamped
Langevin dynamics and find, as in DNA experiments [War85], that thermal denaturation
is due to the melting of the hairpin where each base-pair fluctuates independently.
In contrast, mechanical denaturation occurs by sequential unzipping of the hairpin. An
interesting prediction is that the refolding mechanism after a temperature quench should
be different from that after a force quench. In particular, they find that the folding times
upon force quench from stretched states are much larger than those upon temperature
quench from random states. They explain this phenomenon by the fact that stretched
conditions make the molecule explore domains of phase space that are inaccessible at
high temperatures by random coil configurations. Interestingly, such a statement is also
valid for proteins and such reported differences are expected to occur in proteins as well.
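
Two of the ingredients mentioned above, a screened electrostatic potential and an overdamped Langevin dynamics, can be sketched generically in Python. The sketch is not the model of [Hye05]: it uses reduced units, a Debye-Hückel potential of the form A exp(-r/lambda)/r with a hypothetical prefactor A, and a plain Euler-Maruyama discretization of the overdamped Langevin equation. All names and parameter values are assumptions.

```python
import math
import random


def debye_huckel(r, prefactor=1.0, screening_length=1.0):
    """Screened Coulomb energy A * exp(-r / lambda) / r in reduced units."""
    if r <= 0.0:
        raise ValueError("distance must be positive")
    return prefactor * math.exp(-r / screening_length) / r


def langevin_step(x, force, dt, kbt=1.0, friction=1.0, rng=random):
    """One Euler-Maruyama step of overdamped Langevin dynamics.

    x: list of coordinates; force: callable returning the list of forces at x.
    Update rule: x <- x + F dt / gamma + sqrt(2 kBT dt / gamma) * xi,
    with xi a standard Gaussian random number drawn for each coordinate.
    """
    noise_amplitude = math.sqrt(2.0 * kbt * dt / friction)
    forces = force(x)
    return [
        xi + fi * dt / friction + noise_amplitude * rng.gauss(0.0, 1.0)
        for xi, fi in zip(x, forces)
    ]


if __name__ == "__main__":
    rng = random.Random(0)

    def harmonic(x):
        # Hypothetical test force: independent harmonic wells of unit stiffness.
        return [-xi for xi in x]

    x = [1.0, -2.0]
    for _ in range(1000):
        x = langevin_step(x, harmonic, dt=0.01, kbt=0.5, rng=rng)
    print(x)
```

In a temperature-quench simulation the parameter kbt would be changed abruptly, whereas in a force-quench simulation it is the external force contribution inside `force` that would be switched; the integrator itself is unchanged.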



5. CONCLUSION 

The increasing number of single-protein experiments is providing insight into the inner
details of the protein-folding problem. More generally, the combination of single-molecule
techniques, with and without force, provides new quantitative results that can
be rationalized with existing theories. In the long term these experimental results will be
useful to better understand the basic mechanisms underlying many biological processes 
at the molecular and cellular level. 

The combination of detailed numerical simulations and experiments has sharpened 
the theory underlying the propensity of proteins to fold. In particular, it has confirmed 
the funnel-like shape of the energy landscape without excluding well defined steps in 
the successive stages of the unfolding/folding transition. The comparison of single-molecule 
techniques that use mechanical force, on the one hand, and fluorescence, on the other, 
raises new and interesting questions about the nature of the early intermediate states that 
form during the folding process. The following questions can now be addressed with recently 
available techniques: Under which conditions can mechanical force stabilize different 
intermediate states? Is the intermediate state observed under force the same as the early 
state that forms without force? Answering them will require new experiments designed to 
obtain structural information about the intermediate states, the transition states and also 
the denatured states. 

An alternative approach is to consider on-lattice protein-like models that show some 
of these behaviours. It is then possible to investigate the different scenarios in a given 
protein and clarify whether the two intermediates found with and without force 
are the same or not. In this regard, the mechanical response studied at various forces, or 
by applying the force at different locations along the polypeptide chain, is expected to 
differ depending on which scenario is correct. Finally, further simulations of coarse-grained 
protein models, in conjunction with experimental measurements, might lead to 
improved models that faithfully reproduce the unfolding/folding pathways of proteins 
with and without force. 
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